Abstract


Traditionally, climate changes have been detected from long series of observations and long
after they have happened. Our "inverse sequential" procedure, for detecting change as soon as it
occurs, describes the existing or most recent data by their frequency distribution. Its parameter(s)
are estimated both from the existing set of observations and from the same set augmented by
j new observations. Individual-value probability products ("likelihoods") are used to form ratios
which yield two probabilities for erroneously accepting the existing parameter(s) as valid for the
data set, and vice versa. A genuine parameter change is signalled when these
probabilities (or a more stable compound probability) show a progressive decrease. New parameter
values can then be estimated from the new observations alone using standard statistical techniques.

The inverse sequential procedure will be illustrated for global annual mean temperatures 
(assumed normally distributed), and for annual numbers of North Atlantic hurricanes (assumed to 
represent Poisson distributions). The procedure has been developed, but not yet tested, for linear 
or exponential trends, and for chi-square means or degrees of freedom, a special measure of 
autocorrelation (Radok, 1992). 
1. Introduction

The detection of changes in a developing time series requires some idea of what form they 
are likely to take. When the nature of the forcing is known, filters can be designed that will show 
their effects most clearly (Kim and North, 1991), but that knowledge is often not available in the 
geophysical sciences. There are many time series which can be viewed as potentially 
inhomogeneous, made up of irregular-length sections each of which differs from its neighbors in 
one or more of the parameters that define its signal and noise characteristics. As long as its 
parameters remain unchanged, an individual section can then be said to be in "statistical control" 
(Shewhart, 1939). 

There exists considerable evidence that this concept is realistic in many geophysical 
contexts, for instance those exhibiting the "Hurst phenomenon" much discussed in hydrology 
(e.g., Klemes, 1974). With its minimum of arbitrary assumptions, the concept of statistical 
control suggests a general monitoring approach that registers the length and end of each controlled 
state, together with the new parameter values. The magnitude of changes in geophysical 
parameters cannot be anticipated, but their surveillance might use a probability for regarding the 
parameters established from existing observations as significantly changed by the addition of one 
or more new observations. 

Such a "sequential" use of accruing information was pioneered by Wald (1945) and has 
developed into a large special field of statistics (cf., e.g., Gosh, 1988) which includes a range of
procedures utilizing cumulative sums ("cusum" techniques; e.g., Goel, 1982). The typical
outcome in the simplest situation is a decision, with prescribed error probabilities, to accept one of
two specified parameter values, or to continue sampling.

The "inverse" sequential approach presented here instead progressively determines
"no-change" probabilities for parameter estimates based, respectively, on the accrued data and on the
same data augmented by one or several new observations. A parameter change is then signaled
when these probabilities begin decreasing to small values.

The basic relations for such a procedure are developed in section 2 and formulated for the 
means and variances of Gaussian and Poisson variates in Appendix A; mathematical derivations 
can be found in a paper submitted for publication in the American Statistician, and in a project 
report in preparation which will include computer programs for performing the calculations. These 
are illustrated in section 3, and a well-known way of combining the probabilities for several 
parameters into a change "fingerprint" is recalled in the last section. 

2. Theory 

Consider a series of $m$ observations $x_i$, $i = 1, 2, \ldots, m$, to which $j$ further observations are
added ($j = 1, 2, \ldots$). For a parameter $\theta$ (such as mean, variance, trend, etc.) the first $m$ values yield
an optimum estimate $\theta_m$ which the additional $j$ observations change to $\theta_{m+j}$. Writing the
corresponding probabilities of individual $x_i$ as $p_m$ and $p_{m+j}$, respectively, the likelihood function of
the first $m$ observations is $\prod_{i=1}^{m} p_m = L_m(m)$ when $\theta = \theta_m$, and $\prod_{i=1}^{m} p_{m+j} = L_{m+j}(m)$ when $\theta = \theta_{m+j}$.
Here the bracketed number indicates the number of observations in the product, while the subscript
is the number of observations used for the parameter estimate. The likelihood ratio
$q(m) = L_{m+j}(m) / L_m(m) < 1$ if $\theta_m$ represents an optimum estimate for the $m$ observations. In the same
way we define $L_{m+j}(m+j)$ and $L_m(m+j)$ for the $m+j$ observations which yield a likelihood ratio
$q(m+j) = L_{m+j}(m+j) / L_m(m+j) > 1$. The likelihoods and their ratios provide the elements for a
formal test of two hypotheses. The first, $H(m)$, states that $\theta = \theta_m$ for the existing $m$ observations,
while the second, $H(m+j)$, states that $\theta = \theta_{m+j}$ for the augmented set of $m+j$ observations.



Integration of the likelihood function $L_m(m)$ over its $m$-dimensional sample space $R(m)$
gives the probability of accepting the hypothesis $H(m)$ when true as $1 - \alpha$, where $\alpha$ is the
probability that $H(m)$ will be rejected when the sample point falls into a remote "critical" rejection
region of the sample space, even though $H(m)$ remains true there ("type I error"). The
corresponding integration of $L_{m+j}(m)$ leads to a second probability, $\beta$, that $H(m+j)$ is erroneously
rejected when the sample point falls in the same region; this is also the probability of accepting the
hypothesis $H(m)$ when false ("type II error"). The integrated likelihood ratio for the $m$
observations thus becomes

$$q^*(m) = \frac{\beta}{1-\alpha} \qquad (1)$$
Applying the same argument to the augmented set of $m+j$ observations, with corresponding error probabilities $\alpha'$ and $\beta'$ for $H(m+j)$, leads to

$$q^*(m+j) = \frac{1-\alpha'}{\beta'} \qquad (2)$$

Disregarding the slight difference between the critical regions of $R(m)$ and $R(m+j)$ for the two data
sets, we can approximate $\alpha'$, the probability of rejecting $H(m+j)$ when true, by $\beta$, the probability
of accepting $H(m)$ when false; on a similar argument, $\beta' \approx \alpha$, so that equation (2) takes the
approximate form

$$q^*(m+j) \approx \frac{1-\beta}{\alpha} \qquad (3)$$

For the inverse sequential procedure we replace the $q^*$ by the observed sample values $q$ of the
likelihood ratios, and from (1) and (3) obtain two relations for estimating $\alpha$ and $\beta$:

$$\alpha = \frac{1 - q(m)}{q(m+j) - q(m)} \qquad (4a)$$

$$\beta = \frac{q(m)\,[\,q(m+j) - 1\,]}{q(m+j) - q(m)} \qquad (4b)$$

Equations (1) and (3) state the familiar decision limits of Wald's (1945) sequential probability ratio
test (SPRT). Our argument in effect places the likelihood ratio $q(m)$ on Wald's lower decision
limit, and the ratio $q(m+j)$ on different upper decision limits. But in contrast to an SPRT, those
limits now involve known likelihood ratios $q(m)$ and $q(m+j)$ and unknown probabilities $\alpha$ and $\beta$. A
definite change of control, from $\theta_m$ to $\theta_{m+j}$, is signalled when both probabilities decrease to small
values.

In practice rounding errors can raise the likelihood ratio $q(m)$ to values larger than unity and
similarly lower $q(m+j)$ to values below one. Equations (4) then give unrealistic probabilities that
are negative or larger than 1. Such $q$ values may be replaced by 1, giving the probabilities the
values 0 and 1, respectively (or 0.5 if both $q$ are taken as 1).
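
The estimation step and the rounding safeguard can be sketched in a few lines of Python. This is only an illustration, assuming that the two relations are Wald's limits q(m) = β/(1 − α) and q(m+j) = (1 − β)/α solved for α and β; the function name `error_probabilities` is hypothetical and not taken from the text.

```python
def error_probabilities(q_m, q_mj):
    """Estimate (alpha, beta) from the observed ratios q(m) and q(m+j).

    Assumption: q(m) = beta / (1 - alpha) and q(m+j) = (1 - beta) / alpha,
    which gives alpha = (1 - q_m) / (q_mj - q_m) and
    beta = q_m * (q_mj - 1) / (q_mj - q_m).
    """
    # Replace ratios that rounding has pushed past their bounds by 1.
    q_m = min(q_m, 1.0)
    q_mj = max(q_mj, 1.0)
    if q_m == 1.0 and q_mj == 1.0:
        return 0.5, 0.5  # both ratios taken as 1
    denom = q_mj - q_m
    alpha = (1.0 - q_m) / denom
    beta = q_m * (q_mj - 1.0) / denom
    return alpha, beta


alpha, beta = error_probabilities(0.25, 4.0)
print(alpha, beta)
```

With q(m) = 0.25 and q(m+j) = 4 both probabilities come out close to 0.2, and substituting them back reproduces the two ratios.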

For monitoring, the average of the two error probabilities can be used, but a more robust
single no-change probability $\gamma$ is defined by the ratio $q(m+j)/q(m)$. We write

$$\frac{q(m+j)}{q(m)} = \frac{(1-\alpha)(1-\beta)}{\alpha\beta} \equiv \frac{(1-\gamma)^2}{\gamma^2} \qquad (5)$$

Taking square roots and solving for $\gamma$ leads to

$$\gamma = \left[\,1 + \sqrt{\frac{q(m+j)}{q(m)}}\,\right]^{-1}. \qquad (6)$$

The probability $\gamma$ remains between 0 and 1/2 for $q(m+j) > q(m)$, and can be shown to fall between
the arithmetic and geometric means of the two probabilities defined by (4a) and (4b).
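
A short numerical check of the compound probability may help. The sketch below assumes the form γ = 1 / (1 + sqrt(q(m+j)/q(m))) obtained by taking square roots of the ratio of the two likelihood ratios; the function name `no_change_probability` and the values α = 0.1, β = 0.4 are made up for illustration.

```python
import math


def no_change_probability(q_m, q_mj):
    """Compound no-change probability gamma from q(m) and q(m+j)."""
    return 1.0 / (1.0 + math.sqrt(q_mj / q_m))


# Illustrative error probabilities (not from the paper).
alpha, beta = 0.1, 0.4
q_m = beta / (1.0 - alpha)    # assumed lower limit
q_mj = (1.0 - beta) / alpha   # assumed upper limit

gamma = no_change_probability(q_m, q_mj)
arithmetic = 0.5 * (alpha + beta)
geometric = math.sqrt(alpha * beta)
print(geometric, gamma, arithmetic)
```

For these values γ lies between the geometric mean (0.2) and the arithmetic mean (0.25) of α and β, as the text states, and γ equals 0.5 when the two ratios coincide.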

Inverse sequential formulae for q(m) and q(m+j) are given in appendix A. The next section 
illustrates their use for monitoring changes in Gaussian and Poisson means and variances. 

3. Applications 

The procedure developed in section 2, and explicitly formulated in appendix A, tests the 
"null hypothesis" that the originally available data in question remain homogeneous as new data are 
added. A developing inhomogeneity becomes apparent first as a progressive decrease in the
"no-change" probability $\gamma$, but that decrease will only continue all the way to small values when the
parameters for the augmented (original plus new) data differ significantly from those valid for the
original data alone. Clearly parameter estimates derived solely from the new data will show such
differences well before the augmented set can do so. Therefore, the inverse sequential test is
terminated as soon as a systematic decrease in $\gamma$ has been firmly established; standard statistical
procedures can then be used to compare the parameters of the original data with those derived from
the new data that caused the probability decrease. That final step is omitted in the examples that
follow since its result in general must be assessed by geophysical considerations as well as by its 
statistical significance. 
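
The text does not state a numerical criterion for when a decrease counts as "firmly established". Purely as an illustration, the sketch below flags the first point at which the no-change probability has fallen several times in a row and has dropped below a chosen level. The function name `first_sustained_decrease`, the parameters `run` and `level`, and the sample values are all assumptions made for this example.

```python
def first_sustained_decrease(gammas, run=3, level=0.2):
    """Return the first index at which gamma has fallen `run` times in a
    row and lies below `level`, or None if that never happens.

    `run` and `level` are illustrative choices, not values from the text.
    """
    falls = 0
    for j in range(1, len(gammas)):
        # Count consecutive decreases; any increase resets the count.
        falls = falls + 1 if gammas[j] < gammas[j - 1] else 0
        if falls >= run and gammas[j] < level:
            return j
    return None


# Made-up sequence of no-change probabilities for j = 1, 2, ...
gammas = [0.48, 0.50, 0.45, 0.47, 0.40, 0.30, 0.18, 0.09]
print(first_sustained_decrease(gammas))
```

In an application the index returned would mark where the test is stopped and the parameters of the new observations are estimated separately.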

As a first application we attempt to detect changes of mean and variance in two series of 
global mean temperature anomalies (deviations from the long-term mean 1958-77) reported by 
Angell and Korshover (1987; updated in Boden et al., 1990). Figure 1a shows these data for the
surface and Figure 2a for the upper troposphere/lower stratosphere (the layer between the 100 hPa 
and 300 hPa constant-pressure surfaces). 

The three probabilities for the surface observations are given in Figure 1b; they suggest a
change in control around 1980. The test is then continued with the new larger mean and variance 
based on the observations for the years 1979-83; no further control changes are evident from the 
remaining data. 

The probabilities for the temperature anomalies of the upper troposphere/lower stratosphere 
are given in Figure 2b. No changes of control can be discerned, although the mean decreased 
slightly from its initial value towards the end of the period of record used. 

A second application of the inverse sequential procedure uses the annual numbers of 
tropical hurricanes recorded for the North Atlantic by Case (1988; updated to 1990) as shown in 
Figure 3a. The frequency distribution of these numbers for the period 1931-1990 (Figure 4) 
broadly conforms to a Poisson distribution with a mean of 5.6 (dashed lines in Figure 4). 

Figure 3b gives the three error probabilities. A weak change of control is suggested to 
have occurred around 1940, with a decrease in the mean number to 3.8, followed by a more 
distinct change to a mean number of 7.6 around 1950. A renewed decrease back to the original 
mean number of 5.6 hurricanes per year is suggested by the gradual decrease of $\gamma$ in the early
1960s. The remaining data show no further changes of control, even when a new base period is 
adopted in the 1970s in order to sharpen the test. 


4. Conclusion 


The inverse sequential procedure here described represents a new approach to the 
monitoring of time series, and clearly requires further experimentation and development.
Mathematical details have already been formulated for detecting changes in linear and exponential 
trends, and in the means of chi-square variates which also represent their degrees of freedom and 
can be used as a measure of autocorrelation (Radok, 1992). We plan to apply the full procedure to 
the geophysical data provided in CD-ROM format by NASA under the Greenhouse Effect 
Detection Experiment (GEDEX; Schiffer and Unimayar, 1992; Olsen and Warnock, 1992).
Another data archive to be tested is the Comprehensive Ocean Atmosphere Data Set (COADS; 
Woodruff et al., 1987; Diaz and Brown, 1992). 

As further steps in the procedure, the lengths of statistically controlled sections can 
themselves be analyzed as a potential Poisson variate, and the independent probabilities obtained 
for different variables can be combined following Fisher (1941, section 21.1) to construct
"fingerprints" of climatic change in the form of chi-square variates with 2k degrees of freedom,

$$\chi^2_{2k} = -2 \sum_{i=1}^{k} \ln P_i \qquad (7)$$

where k is the number of independent probabilities $P_i$ combined.
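
Fisher's combination can be sketched as follows. The statistic is −2 times the sum of the logarithms of the k independent probabilities; for an even number of degrees of freedom 2k the chi-square upper-tail probability has a closed form, so no statistics library is needed. The function name `fisher_combined` and the input values are illustrative assumptions.

```python
import math


def fisher_combined(probabilities):
    """Combine k independent probabilities (each > 0) by Fisher's method.

    Returns (chi2, degrees_of_freedom, upper_tail_probability).
    """
    k = len(probabilities)
    chi2 = -2.0 * sum(math.log(p) for p in probabilities)
    # For 2k degrees of freedom:
    # P(X > chi2) = exp(-chi2/2) * sum_{i<k} (chi2/2)**i / i!
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return chi2, 2 * k, math.exp(-half) * total


# Made-up no-change probabilities for three variables.
chi2, dof, tail = fisher_combined([0.20, 0.10, 0.05])
print(chi2, dof, tail)
```

A single probability is returned unchanged by this combination, which gives a simple check on the implementation.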
Figure captions


Figure 1. Inverse sequential test applied to global surface temperature anomalies. 

a) Annual mean deviations °C from 1958-77 mean (Angell and Korshover, 1987; updated in Boden 
et al., 1990). 

b) Probabilities that no parameter changes are occurring. For symbols see text. 

Figure 2. Inverse sequential test applied to global temperatures of the upper troposphere/lower 
stratosphere (100-300 hPa layer). 

a) Annual mean deviations °C from 1958-1977 mean (Angell and Korshover, 1987; updated in 
Boden et al., 1990). 

b) Probabilities that no parameter changes are occurring. For symbols see text. 

Figure 3. Inverse sequential test applied to hurricane numbers. 

a) Annual number of North Atlantic hurricanes, 1931-1990 (Case, 1988; updated). 

b) Probabilities that no change in the mean number is occurring. For symbols see text. 

Figure 4. Frequency histograms of hurricane numbers (solid lines) and Poisson distribution with 
mean 5.6 (dashed lines). 



Appendix A: Inverse sequential formulae for means and variances of random 
samples from Gaussian and Poisson distributions 

The restrictions to these distributions imply a need to verify that the data are indeed so 
distributed, and to perform an appropriate transformation if they are not (as described by e.g. 
Curtiss, 1943). The formulae give the basic probability $p$ in the likelihood functions for $m$ and
$m+j$ observations, and the likelihood ratios $q(m)$ and $q(m+j)$ used to calculate the probabilities $\alpha$,
$\beta$ and $\gamma$ from equations (4) and (6) in section 2. Subscripts indicate the number of values used for
parameter estimates, and bracketed symbols give the numbers used to calculate the likelihoods and
their ratios.


(1) Gaussian mean and variance 
The basic probability,

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],$$

involves two parameters which cannot be separated in the test since in the present context neither
parameter is prescribed. As an estimate for the distribution mean $\mu$ we use the sample mean $\bar{x}_n$;
the distribution variance is estimated as $\sigma_n^2 = [n/(n-1)]\,s_n^2$, where $s_n^2$ is the sample variance and $n$
($= m$ or $m+j$) the number of values used for the estimates. Then the two likelihood ratios are given
by


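$$q(m) = \exp\left[\, m \log\frac{\sigma_m}{\sigma_{m+j}} + \frac{m-1}{2}\left(1 - \frac{\sigma_m^2}{\sigma_{m+j}^2}\right) - \frac{m}{2}\left(\frac{\bar{x}_m - \bar{x}_{m+j}}{\sigma_{m+j}}\right)^2 \right]$$

and

$$q(m+j) = \exp\left[\, (m+j) \log\frac{\sigma_m}{\sigma_{m+j}} + \frac{m+j-1}{2}\left(\frac{\sigma_{m+j}^2}{\sigma_m^2} - 1\right) + \frac{m+j}{2}\left(\frac{\bar{x}_{m+j} - \bar{x}_m}{\sigma_m}\right)^2 \right].$$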
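
As a cross-check that does not rely on any closed form, the Gaussian likelihood ratios can be evaluated directly from the log-likelihoods. The sketch below assumes the sample mean and the variance estimate [n/(n−1)]s² as parameter estimates, as described above; the function names and the data values are made up for illustration.

```python
import math


def estimates(xs):
    """Sample mean and variance estimate [n/(n-1)] s^2 of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var


def gaussian_log_likelihood(xs, mean, var):
    """Sum of log Gaussian densities of xs for the given mean and variance."""
    return sum(-0.5 * math.log(2.0 * math.pi * var)
               - (x - mean) ** 2 / (2.0 * var) for x in xs)


def gaussian_ratios(old, new):
    """Return (q(m), q(m+j)) for m old and j new observations."""
    both = list(old) + list(new)
    mean_m, var_m = estimates(old)
    mean_mj, var_mj = estimates(both)
    # q(m): first m values, new estimates over old estimates.
    log_q_m = (gaussian_log_likelihood(old, mean_mj, var_mj)
               - gaussian_log_likelihood(old, mean_m, var_m))
    # q(m+j): all m+j values, new estimates over old estimates.
    log_q_mj = (gaussian_log_likelihood(both, mean_mj, var_mj)
                - gaussian_log_likelihood(both, mean_m, var_m))
    return math.exp(log_q_m), math.exp(log_q_mj)


# Made-up anomalies: five "existing" values and two warmer new ones.
q_m, q_mj = gaussian_ratios([0.1, -0.2, 0.0, 0.3, -0.1], [0.6, 0.8])
print(q_m, q_mj)
```

Working with differences of log-likelihoods and exponentiating only at the end keeps the calculation stable for longer series.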


(2) Poisson mean (= variance)

This case is simpler because the basic probability,

$$p(x) = \frac{\lambda^x e^{-\lambda}}{x!},$$

has only a single parameter, $\lambda$, the mean number of occurrences. The likelihood ratios are
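
$$q(m) = \exp\left[\, m\,(\lambda_m - \lambda_{m+j}) + m\,\lambda_m \log\frac{\lambda_{m+j}}{\lambda_m} \right]$$

and

$$q(m+j) = \exp\left[\, (m+j)\,(\lambda_m - \lambda_{m+j}) + (m+j)\,\lambda_{m+j} \log\frac{\lambda_{m+j}}{\lambda_m} \right],$$

where $\lambda_n = \bar{x}_n$ is the mean estimated from $n$ values.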
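
For the Poisson case the factorial terms cancel in the ratios, so only the sample means and sums are needed. The following sketch assumes the Poisson probability exp(−λ) λ^x / x! with λ estimated by the sample mean of the m or m+j values; the function name `poisson_ratios` and the counts are made up and are not the hurricane data of section 3.

```python
import math


def poisson_ratios(old, new):
    """Return (q(m), q(m+j)) for m old and j new Poisson counts.

    Requires both sample means to be positive.
    """
    both = list(old) + list(new)
    m, n = len(old), len(both)
    lam_m = sum(old) / m     # mean from the first m values
    lam_mj = sum(both) / n   # mean from all m+j values
    log_rate = math.log(lam_mj / lam_m)
    # Log-likelihood differences; the log(x!) terms cancel.
    log_q_m = m * (lam_m - lam_mj) + sum(old) * log_rate
    log_q_mj = n * (lam_m - lam_mj) + sum(both) * log_rate
    return math.exp(log_q_m), math.exp(log_q_mj)


# Made-up annual counts: five existing years, three busier new years.
q_m, q_mj = poisson_ratios([5, 6, 4, 7, 6], [9, 8, 10])
print(q_m, q_mj)
```

When the new counts leave the mean unchanged, both ratios equal 1 and no change is indicated.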